Abstract

Let $\mathcal{F}$ be a family of Borel measurable functions on a complete separable metric
space. The gap (or fat-shattering) dimension of $\mathcal{F}$ is a combinatorial quantity that
measures the extent to which functions $f \in \mathcal{F}$ can separate finite sets of points at a
predefined resolution $\gamma > 0$. We establish a connection between the gap dimension
of $\mathcal{F}$ and the uniform convergence of its sample averages under ergodic sampling. In
particular, we show that if the gap dimension of $\mathcal{F}$ at resolution $\gamma > 0$ is finite, then for
every ergodic process the sample averages of functions in $\mathcal{F}$ are eventually within $10\gamma$
of their limiting expectations uniformly over the class $\mathcal{F}$. If the gap dimension of $\mathcal{F}$ is
finite for every resolution $\gamma > 0$ then the sample averages of functions in $\mathcal{F}$ converge
uniformly to their limiting expectations. We assume only that $\mathcal{F}$ is uniformly bounded
and countable (or countably approximable). No smoothness conditions are placed on
$\mathcal{F}$, and no assumptions beyond ergodicity are placed on the sampling processes. Our
results extend existing work for i.i.d. processes.

1 Introduction

Let $\mathcal{X}$ be a complete separable metric space, and let $\mathcal{F}$ be a countable family of Borel-
measurable functions $f : \mathcal{X} \to \mathbb{R}$. We assume in what follows that $\mathcal{F}$ is uniformly bounded
in the sense that $|f(x)| \le M$ for every $x \in \mathcal{X}$ and $f \in \mathcal{F}$, where $M < \infty$ is a fixed constant.
Let $\mathbf{X} = X_1, X_2, \ldots$ be a stationary ergodic process taking values in $\mathcal{X}$. By the ergodic
theorem, for each $f \in \mathcal{F}$, the averages $m^{-1} \sum_{i=1}^{m} f(X_i)$ converge with probability one to
$Ef(X)$. Of interest here is the limiting behavior of the discrepancy

\[ \Gamma_m(\mathcal{F} : \mathbf{X}) = \sup_{f \in \mathcal{F}} \left| \frac{1}{m} \sum_{i=1}^{m} f(X_i) - Ef(X) \right| \tag{1} \]

which measures the maximum difference between m-sample averages and their limiting
expectations over the functions in $\mathcal{F}$.
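
As a rough numerical illustration of the discrepancy defined above, the following Python sketch evaluates it for a small finite family along a single sample path. The irrational rotation used as the ergodic process, the family of interval indicators, and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import math

def discrepancy(functions, expectations, sample):
    """Largest |sample average - expectation| over a finite family."""
    m = len(sample)
    return max(
        abs(sum(f(x) for x in sample) / m - mu)
        for f, mu in zip(functions, expectations)
    )

# Assumed example process: rotation of [0, 1) by an irrational angle.
ALPHA = (math.sqrt(5) - 1) / 2

def rotation_path(x0, m):
    """Return X_1, ..., X_m with X_i = x0 + i * ALPHA (mod 1)."""
    return [(x0 + i * ALPHA) % 1.0 for i in range(1, m + 1)]

# Assumed finite family: indicators of [0, b). Under the uniform law E f_b(X) = b.
thresholds = [0.1 * j for j in range(1, 10)]
family = [lambda x, b=b: 1.0 if x < b else 0.0 for b in thresholds]

for m in (10, 100, 10000):
    print(m, discrepancy(family, thresholds, rotation_path(0.3, m)))
```

For a finite family such as this one the printed values shrink as m grows, as the ergodic theorem predicts; the interest of the paper lies in infinite families, where the supremum need not behave this way.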

The discrepancy $\Gamma_m(\mathcal{F} : \mathbf{X})$ and related quantities have been studied in a number of
fields, including empirical process theory, machine learning and non-parametric inference.
The majority of existing work considers the case in which $X_1, X_2, \ldots$ are independent and
identically distributed, but there is also a substantial literature concerned with the behavior
of the discrepancy for mixing processes (see [1] and the discussion below). Our focus here
is on the general dependent case: the process $\mathbf{X}$ is not assumed to satisfy any mixing
conditions beyond ergodicity.

When $\mathbf{X}$ is ergodic, the limiting behavior of the discrepancy $\Gamma_m(\mathcal{F} : \mathbf{X})$ can be
summarized by a single number. As shown in Steele [15], Kingman's subadditive ergodic theorem
implies that there is a non-negative constant $\Gamma(\mathcal{F} : \mathbf{X})$ such that

\[ \lim_{m \to \infty} \Gamma_m(\mathcal{F} : \mathbf{X}) = \Gamma(\mathcal{F} : \mathbf{X}) \quad \text{wp1.} \tag{2} \]

We will call $\Gamma(\mathcal{F} : \mathbf{X})$ the asymptotic discrepancy of $\mathcal{F}$ on $\mathbf{X}$, and will omit mention of $\mathbf{X}$
when no confusion will arise. When $\Gamma(\mathcal{F} : \mathbf{X}) = 0$ the sample averages of functions $f \in \mathcal{F}$
converge uniformly to their limiting expectations, and $\mathcal{F}$ is said to be a Glivenko-Cantelli
class for the process $\mathbf{X}$.

In this paper we provide bounds on the asymptotic discrepancy of $\mathcal{F}$ in terms of a
combinatorial quantity known as the gap dimension that measures the complexity of $\mathcal{F}$ at
different resolutions or scales.

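Definition: Let $\gamma > 0$. The family $\mathcal{F}$ is said to $\gamma$-shatter a finite set $D \subseteq \mathcal{X}$ if there is an
$a \in \mathbb{R}$ such that for every $D_0 \subseteq D$ there exists a function $f \in \mathcal{F}$ satisfying

\[ f(x) \ge a + \gamma \ \text{ if } x \in D_0 \quad \text{and} \quad f(x) \le a - \gamma \ \text{ if } x \in D \setminus D_0. \]

The gap dimension of $\mathcal{F}$ at resolution $\gamma$, written $\dim_\gamma(\mathcal{F})$, is the largest $k$ such that $\mathcal{F}$
$\gamma$-shatters some set of cardinality $k$. If $\mathcal{F}$ can $\gamma$-shatter sets of arbitrarily large finite
cardinality, then $\dim_\gamma(\mathcal{F}) = +\infty$.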
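
For a finite family on a finite domain the definition can be checked by brute force. The sketch below is only an illustration under assumptions of our own: functions are stored as tuples of values indexed by point, the inequalities are taken to be non-strict, and the names `gamma_shatters` and `gap_dimension` are hypothetical. It uses the observation that if some level $a$ works, then a level of the form $f(x) + \gamma$ also works, so only finitely many levels need to be tried.

```python
from fractions import Fraction
from itertools import combinations

def gamma_shatters(family, points, gamma):
    """True if `family` gamma-shatters the index set `points` with a constant level a."""
    points = list(points)
    if not points:
        return len(family) > 0
    subsets = [set(c) for r in range(len(points) + 1)
               for c in combinations(points, r)]
    # It suffices to try levels of the form a = f(x) + gamma.
    candidates = {f[x] + gamma for f in family for x in points}
    for a in candidates:
        if all(
            any(
                all((f[x] >= a + gamma) if x in d0 else (f[x] <= a - gamma)
                    for x in points)
                for f in family
            )
            for d0 in subsets
        ):
            return True
    return False

def gap_dimension(family, n_points, gamma):
    """Largest k such that some k-point subset of {0, ..., n_points - 1} is gamma-shattered."""
    best = 0
    for k in range(1, n_points + 1):
        if any(gamma_shatters(family, c, gamma)
               for c in combinations(range(n_points), k)):
            best = k
        else:
            break  # subsets of shattered sets are shattered, so larger k cannot succeed
    return best

# All 0/1-valued functions on two points.
cube = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(gap_dimension(cube, 2, Fraction(1, 2)), gap_dimension(cube, 2, Fraction(3, 4)))
```

The search is exponential in the number of points and is meant only to make the quantifiers in the definition concrete.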

The gap dimension was introduced by Kearns and Schapire [9] in a slightly more general
form. Specifically, they allowed the constant $\gamma$ to be replaced by a fixed function $g : \mathcal{X} \to \mathbb{R}$.
We will refer to this notion as the weak gap dimension in what follows. The definition of
gap dimension given here was suggested by Alon, Ben-David, Cesa-Bianchi and Haussler
[2], who also established elementary bounds relating the gap and weak gap dimensions.
Gap dimensions have been referred to by a variety of names in the literature, including
scale-sensitive dimension and fat-shattering dimension. Our principal result is the following
theorem. As above, $\mathcal{X}$ is assumed to be a complete separable metric space.

Theorem 1. Let $\mathcal{F}$ be a countable, uniformly bounded family of Borel measurable functions
$f : \mathcal{X} \to \mathbb{R}$, and let $\mathbf{X}$ be a stationary ergodic process with values in $\mathcal{X}$. If the asymptotic
discrepancy $\Gamma(\mathcal{F} : \mathbf{X}) > \eta$ for some $\eta > 0$, then $\dim_\gamma(\mathcal{F}) = \infty$ for every $\gamma \le \eta/10$.

The constant 10 dividing $\eta$ can, with minor modifications of the proof, be improved to
$4 + \epsilon$, where $\epsilon$ is any fixed positive constant. Theorem [1] has the following, equivalent, form.

Corollary 1. Let $\mathcal{F}$ be as in Theorem [1]. If $\dim_\gamma(\mathcal{F}) < \infty$ for some $\gamma > 0$ then $\Gamma(\mathcal{F} : \mathbf{X}) \le 10\gamma$
for every stationary ergodic process. In particular, if $\dim_\gamma(\mathcal{F}) < \infty$ for every $\gamma > 0$,
then $\Gamma(\mathcal{F} : \mathbf{X}) = 0$ for every stationary ergodic process.

Uncountable Families. The countability of $\mathcal{F}$ ensures that the discrepancies $\Gamma_m(\mathcal{F} : \mathbf{X})$,
$m \ge 1$, are measurable. More importantly, countability of $\mathcal{F}$ is used in the proof of
Proposition [1] and is a key assumption in Lemma [B]. Nevertheless, one may readily extend
Theorem [1] to uncountable families under simple approximation conditions. Call a
(possibly uncountable) family $\mathcal{F}$ nice for a process $\mathbf{X}$ if $\Gamma_m(\mathcal{F} : \mathbf{X})$ is measurable for each
$m \ge 1$, and if for every $\epsilon > 0$ there exists a countable sub-family $\mathcal{F}_0 \subseteq \mathcal{F}$ such that
$\limsup_m \Gamma_m(\mathcal{F} : \mathbf{X}) \le \limsup_m \Gamma_m(\mathcal{F}_0 : \mathbf{X}) + \epsilon$ with probability one. The conclusion of
Theorem [1] immediately extends to any ergodic process $\mathbf{X}$ for which $\mathcal{F}$ is nice.

In spite of such extensions, assumptions regarding the countability or countable
approximability of $\mathcal{F}$ cannot be dropped altogether, as they exclude extreme examples that can
arise in the context of dependent processes. We illustrate with a simple example from [1].
Let $T$ be an irrational rotation of the unit circle $S_1$ with its uniform measure. Denote by $T^i$
the $i$-fold composition of $T$ with itself if $i \ge 1$, the $|i|$-fold composition of $T^{-1}$ with itself if
$i \le -1$, and the identity if $i = 0$. For each $x \in S_1$ let $C_x = \bigcup_{i=-\infty}^{\infty} \{T^i x\}$ be the (bi-infinite)
trajectory of $x$ under $T$, and let $\mathcal{F}$ be the family of indicator functions of the sets $C_x$. Note
that $\mathcal{F}$ is uncountable, and that every set $C_x$ has Lebesgue measure zero. For distinct
points $x_1, x_2 \in S_1$, either $C_{x_1} = C_{x_2}$, or $C_{x_1} \cap C_{x_2} = \emptyset$, and therefore $\dim_\gamma(\mathcal{F}) = 1$ for
$0 < \gamma \le 1/2$. Now let $X_i = T^i X_0$, where $X_0$ is uniformly distributed on $S_1$. Then the process
$\mathbf{X} = X_1, X_2, \ldots$ is stationary and ergodic. Moreover, it is easy to see that $Ef(X) = 0$ for
each $f \in \mathcal{F}$, and that $\sup_{f \in \mathcal{F}} m^{-1} \sum_{i=1}^{m} f(X_i) = 1$. Thus $\Gamma_m(\mathcal{F} : \mathbf{X}) = 1$ with probability
one for each $m \ge 1$, and the conclusion of Corollary [1] fails to hold.

1.1 Related Work 

Vapnik and Chervonenkis [18] gave necessary and sufficient conditions for uniform
convergence of sample means in the i.i.d. case. Specifically, they showed that if $\mathbf{X}$ is i.i.d.,
then $\Gamma(\mathcal{F} : \mathbf{X}) = 0$ if and only if $n^{-1} \log N(\epsilon, \mathcal{F}, X_1^n) \to 0$ in probability for every $\epsilon > 0$.
Here $N(\epsilon, \mathcal{F}, X_1^n)$ is the number of $\epsilon$-balls needed to cover $\mathcal{F}$ under the empirical $L_1$ metric
$d(f_1, f_2) = n^{-1} \sum_{i=1}^{n} |f_1(X_i) - f_2(X_i)|$. Extensions of these results to empirical processes
can be found, for example, in Giné and Zinn [8] (see also Dudley [7]).
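
To make the empirical $L_1$ metric concrete, the following sketch computes it for functions stored as rows of values on a sample, and builds a cover greedily. This is an assumption-laden illustration: the greedy procedure restricts centers to members of the family and returns an upper bound on the covering number, not the minimal number of balls, and the function names are hypothetical.

```python
def empirical_l1(u, v):
    """Empirical L1 distance between two functions given by their values on a sample."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

def greedy_cover_size(rows, eps):
    """Size of a greedily built eps-cover; an upper bound on the covering number."""
    centers = []
    for row in rows:
        # Open a new ball only if no existing center is within eps of this row.
        if all(empirical_l1(row, c) > eps for c in centers):
            centers.append(row)
    return len(centers)

rows = [(0.0, 0.0), (0.0, 0.1), (1.0, 1.0)]
print(greedy_cover_size(rows, 0.1))
```

Every row ends up within `eps` of some chosen center, so the chosen centers form a valid cover, although a smaller one may exist.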

Talagrand [16] gave necessary and sufficient conditions for uniform convergence of sample
means, which are different from those of [18]. He showed that $\Gamma(\mathcal{F} : \mathbf{X}) > 0$ for an i.i.d.
process $\mathbf{X}$ with $X_i \sim P$ if and only if there exists a set $A$ with $P(A) > 0$ and $\gamma > 0$ such
that for every $n \ge 1$ the family $\mathcal{F}$ $\gamma$-shatters $P^n$-almost every sequence $x_1, \ldots, x_n \in A^n$.

Alon et al. [2] considered the relationship between the gap dimension and the learnability
of classes of uniformly bounded functions under independent sampling. In particular, they
showed that if $\mathcal{F}$ is a family of functions $f : \mathcal{X} \to [0, 1]$ satisfying suitable measurability
conditions, and such that $\dim_\gamma(\mathcal{F})$ is finite for some $\gamma > 0$, then

\[ \lim_{n \to \infty} \sup_{\mathbf{X} \in \mathcal{I}(\mathcal{X})} P\Big( \sup_{m \ge n} \Gamma_m(\mathcal{F} : \mathbf{X}) > \epsilon \Big) = 0 \tag{3} \]

when $\epsilon = 48\gamma$. Here $\mathcal{I}(\mathcal{X})$ is the family of all i.i.d. processes taking values in $\mathcal{X}$. Conversely,
if $\dim_\gamma(\mathcal{F}) = +\infty$, they showed that (3) fails to hold for every $\epsilon < 2\gamma$. Further connections
between the gap dimension and different notions of learnability (in the i.i.d. case) can be
found in [3] and the references therein. Talagrand [17] and Mendelson and Vershynin [11]
showed that the $L_2$ covering numbers of a uniformly bounded set of functions can be
bounded in terms of its weak gap dimension.

In addition to the papers cited above, there are a number of results on uniform
convergence for dependent processes satisfying a variety of standard mixing conditions; a discussion
of these results can be found in [1]. In related work, Rao [13] and Billingsley and Topsøe
[6] studied and characterized classes of functions $\mathcal{F}$ such that $\sup_{f \in \mathcal{F}} |\int f\,dP_n - \int f\,dP| \to 0$
whenever $P_n$ converges weakly to $P$. As noted in [6], the elements of such uniformity classes
are necessarily continuous almost everywhere with respect to $P$. Bickel and Millar [1]
provided sufficient conditions for a more general notion of uniformity, and revisited several of
the results in earlier papers.

Adams and Nobel [1] established Theorem [1] in the special case where the elements of
$\mathcal{F}$ are indicator functions of subsets of $\mathcal{X}$. The problem simplifies in this case, as $\dim_\gamma(\mathcal{F})$
is zero for $\gamma > 1/2$, and equal to the VC-dimension of $\mathcal{F}$ if $0 < \gamma \le 1/2$. If $\mathcal{F}$ has finite
VC-dimension, their results imply that $\Gamma(\mathcal{F} : \mathbf{X}) = 0$ for every ergodic process $\mathbf{X}$. For
uniformly bounded families $\mathcal{F}$ they show that $\Gamma(\mathcal{F} : \mathbf{X}) = 0$ for every ergodic process $\mathbf{X}$ if
$\dim_0(\mathcal{F}) < \infty$, or if $\mathcal{F}$ is a VC-graph class (cf. [12]).

1.2 Overview 

The proof of Theorem [1] is based on the direct construction of $\gamma$-shattered sets of arbitrarily
large cardinality. In particular, the proof does not make use of results or techniques from
the study of uniform convergence in the i.i.d. case. The core of the construction, which is
contained in Section [5] below, follows the arguments in [1].

In the next section we reduce Theorem [1] to an analogous result in which $\mathcal{X}$ is equal to the
unit interval. This equivalent result is stated in Theorem [2]. Section [3] contains several
preliminary definitions and lemmas used in the proof of Theorem [2]. The proof of Theorem
[2] is presented in Sections [4]-[7]. Section [4] gives an outline of the proof of the theorem. The
proofs of two key propositions are given in Sections [5] and [6]. The diagram below provides
an overview of the proof.

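\[ \text{Theorem [1]} \Longleftarrow \text{Theorem [2]} \Longleftarrow \text{Proposition [2]} + \text{Lemma [1]} + \text{Lemma [B]} \]

\[ \text{Proposition [2]} \Longleftarrow \text{Proposition [1]} + \text{Lemma [2]} \]
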

2 Reduction to the Unit Interval 

Let $\mathcal{X}$ and $\mathcal{F}$ be as in Theorem [1] and let $\mathbf{X}$ be an $\mathcal{X}$-valued ergodic process, defined on an
underlying probability space $(\Omega, \mathcal{A}, \mathbb{P})$, such that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$. By assumption, there
exists a number $0 < M < \infty$ such that $|f| \le M$ for each $f \in \mathcal{F}$. Replacing $f \in \mathcal{F}$ with
$f' = (f + M)/2M$, we may assume without loss of generality that each $f \in \mathcal{F}$ takes values
in $[0, 1]$. The proof of the following lemma, which relies on elementary ergodic theory, is
similar to that of Lemma 5 in [1], and is omitted.

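Lemma A. Let $\mathbf{X}$ be a stationary ergodic process with values in $\mathcal{X}$. If $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$,
then $\mathcal{X}$ is necessarily uncountable, and there exists a stationary ergodic process $\tilde{\mathbf{X}}$ with
values in $\mathcal{X}$ such that $P(\tilde{X}_i = x) = 0$ for each $x \in \mathcal{X}$ and $\Gamma(\mathcal{F} : \tilde{\mathbf{X}}) > \eta$.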

Let $\mu(\cdot)$ be the marginal distribution of $\mathbf{X}$. By Lemma [A] it suffices to establish Theorem
[1] in the case where $\mathcal{X}$ is uncountable, and $\mu(\cdot)$ is non-atomic. Let $\lambda(\cdot)$ denote ordinary
Lebesgue measure on the unit interval $[0, 1]$ equipped with its Borel subsets $\mathcal{B}$. By standard
results in real analysis (cf. Theorem 5.16 of [2]), there is a measure space isomorphism
between $(\mathcal{X}, \mathcal{S}, \mu)$ and $([0, 1], \mathcal{B}, \lambda)$. More precisely, there exist Borel measurable sets $\mathcal{X}_0 \subseteq \mathcal{X}$
and $I_0 \subseteq [0, 1]$, and a bijection $\psi : \mathcal{X}_0 \to I_0$ with the following properties: (i) $\mu(\mathcal{X}_0) =
\lambda(I_0) = 1$; (ii) $\psi$ and $\psi^{-1}$ are measurable with respect to the restricted sigma algebras $\mathcal{S} \cap \mathcal{X}_0$
and $\mathcal{B} \cap I_0$, respectively; and (iii) $\mu(A) = \lambda(\psi(A))$ for each $A \in \mathcal{S} \cap \mathcal{X}_0$. In particular, the
event $E = \{X_i \in \mathcal{X}_0^c \text{ for some } i \ge 1\}$ has probability zero. By removing $E$ from the
underlying sample space, we may assume without loss of generality that $X_i(\omega) \in \mathcal{X}_0$ for
each sample point $\omega$ and each $i \ge 1$.

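Define $Y_i = \psi(X_i)$ for $i \ge 1$. Then the process $\mathbf{Y} = Y_1, Y_2, \ldots \in [0, 1]$ is stationary and
ergodic with marginal distribution $\lambda$. For each function $f \in \mathcal{F}$ define an associated function
$\tilde{f} : [0, 1] \to [0, 1]$ via the rule

\[ \tilde{f}(u) = \begin{cases} (f \circ \psi^{-1})(u) & \text{if } u \in I_0 \\ 0 & \text{otherwise} \end{cases} \]

and let $\tilde{\mathcal{F}} = \{\tilde{f} : f \in \mathcal{F}\}$. It is easy to see that $\tilde{f}(Y_i) = f(X_i)$, and in particular, that
$E\tilde{f}(Y) = Ef(X)$. Thus $\Gamma_m(\tilde{\mathcal{F}} : \mathbf{Y}) = \Gamma_m(\mathcal{F} : \mathbf{X})$ with probability one for each $m \ge 1$.
Moreover, if $k$ distinct points $u_1, \ldots, u_k \in [0, 1]$ are $\gamma$-shattered by $\tilde{\mathcal{F}}$, then necessarily each
$u_j \in I_0$, and the (distinct) points $\psi^{-1}(u_1), \ldots, \psi^{-1}(u_k) \in \mathcal{X}$ are $\gamma$-shattered by $\mathcal{F}$. It follows
that $\dim_\gamma(\tilde{\mathcal{F}}) \le \dim_\gamma(\mathcal{F})$. Theorem [1] is therefore a corollary of the following result.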

Theorem 2. Let $\mathcal{F}$ be a countable family of Borel measurable functions $f : [0, 1] \to [0, 1]$,
and let $\mathbf{X} = X_1, X_2, \ldots \in [0, 1]$ be a stationary ergodic process with $X_i \sim \lambda$. If the asymptotic
discrepancy $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$ then $\dim_\gamma(\mathcal{F}) = \infty$ for every $\gamma \le \eta/10$.



3 Preliminaries 

In this section we define three elementary notions that will be used in the proof of Theorem
[2]. The first is the segments of a function $f : [0, 1] \to [0, 1]$. The second is the join of a
sequence of families of disjoint sets. The third is an ancestral set in a binary tree. Lemma [1]
establishes a simple connection between joins, segments and the gap dimension. Lemma [2]
provides a useful bound for obtaining a subtree with good ancestral properties from a large
initial binary tree.

3.1 Segments and Regular Families 

Let $\mathcal{F}$ and $\mathbf{X}$ be as in the statement of Theorem [2] and suppose that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$.
Assume without loss of generality that $\eta$ is rational, and let $\gamma = \eta/5$. Let $K = \lfloor \gamma^{-1} \rfloor + 1$ if
$\gamma^{-1}$ is not an integer, and $K = \gamma^{-1}$ otherwise. For each $f \in \mathcal{F}$ and $1 \le k \le K$ define sets

\[ s_k(f) = \begin{cases} f^{-1}[(k-1)\gamma, k\gamma) & \text{if } 1 \le k \le K-1 \\ f^{-1}[(K-1)\gamma, 1] & \text{if } k = K. \end{cases} \tag{4} \]

Definition: The sets $s_k(f)$ will be called $\gamma$-segments of $f$. Let $\pi(f) = \{s_k(f) : 1 \le k \le K\}$
be the partition of $[0, 1]$ generated by the $\gamma$-segments of $f$. Two segments $s_k(f)$ and $s_{k'}(f)$
will be called adjacent if they correspond to adjacent intervals, equivalently if $|k - k'| = 1$,
and non-adjacent if $|k - k'| \ge 2$.
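
The segment construction amounts to binning the values of $f$ into $K$ intervals of width $\gamma$, with the last interval closed on the right. A minimal sketch follows, assuming exact rational arithmetic via `fractions.Fraction` so that the integrality test on $\gamma^{-1}$ is reliable; the function names are hypothetical.

```python
import math
from fractions import Fraction

def num_segments(gamma):
    """K = floor(1/gamma) + 1 if 1/gamma is not an integer, and 1/gamma otherwise."""
    inv = 1 / Fraction(gamma)
    return int(inv) if inv.denominator == 1 else math.floor(inv) + 1

def segment_index(value, gamma):
    """Index k in {1, ..., K} such that a point x with f(x) = value lies in s_k(f)."""
    gamma = Fraction(gamma)
    k = math.floor(Fraction(value) / gamma) + 1
    # The last segment [(K-1)*gamma, 1] is closed on the right, so value = 1 maps to K.
    return min(k, num_segments(gamma))

def non_adjacent(k, k_prime):
    """Two segments are non-adjacent when their indices differ by at least 2."""
    return abs(k - k_prime) >= 2

print(num_segments(Fraction(1, 5)), segment_index(1, Fraction(1, 5)))
```

Passing floats such as `0.2` would convert them to their exact binary value, which is generally not the intended rational, so rational inputs are assumed throughout.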

In order to establish Theorem [2] we first consider families $\mathcal{F}$ whose elements satisfy a
topological regularity condition. Given a family $\mathcal{F}$ of functions $f : [0, 1] \to [0, 1]$, define the
associated collection of sets

\[ \mathcal{C}(\mathcal{F}) = \{ f^{-1}[a, b) : 0 \le a < b \le 2 \text{ rational, and } f \in \mathcal{F} \}. \tag{5} \]

Including values $b > 1$ ensures that $\mathcal{C}(\mathcal{F})$ contains sets of the form $f^{-1}[a, 1]$. Note that
$\mathcal{C}(\mathcal{F})$ is countable if $\mathcal{F}$ is countable.

Definition: A family $\mathcal{F}$ of measurable functions $f : [0, 1] \to [0, 1]$ is regular if it is countable,
and each element of $\mathcal{C}(\mathcal{F})$ is a finite union of intervals.

3.2 Joins and the Gap Dimension 

In ergodic theory, the join of a finite collection of sets contains the atoms of their generated
field. Here we employ a minor generalization of this notion.

Definition: Let $\mathcal{D}_1, \ldots, \mathcal{D}_k$ be finite families of sets in $[0, 1]$ such that the elements of each
family are disjoint. The join of $\mathcal{D}_1, \ldots, \mathcal{D}_k$, denoted $\bigvee_{i=1}^{k} \mathcal{D}_i$ or $\mathcal{D}_1 \vee \cdots \vee \mathcal{D}_k$, is the collection
of all non-empty intersections $D_1 \cap \cdots \cap D_k$ where $D_i \in \mathcal{D}_i$ for $i = 1, \ldots, k$.
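
The join can be computed directly from its definition when the sets involved are finite. The sketch below uses finite sets of integers as a stand-in for subsets of $[0, 1]$; that substitution, and the name `join`, are assumptions made purely for illustration.

```python
from itertools import product

def join(*families):
    """All non-empty intersections D_1 & ... & D_k with D_i taken from the i-th family."""
    if not families:
        return set()
    cells = set()
    for choice in product(*families):
        cell = frozenset.intersection(*choice)
        if cell:
            cells.add(cell)
    return cells

# Three two-set families on the ground set {0, ..., 7}.
d1 = [frozenset({0, 1, 2, 3}), frozenset({4, 5, 6, 7})]
d2 = [frozenset({0, 1, 4, 5}), frozenset({2, 3, 6, 7})]
d3 = [frozenset({0, 2, 4, 6}), frozenset({1, 3, 5, 7})]
print(len(join(d1, d2, d3)))
```

Here every one of the possible intersections is non-empty, so the join of the three pairs attains the largest cardinality a join of three two-set families can have.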

The next lemma establishes a useful connection between the gap dimension of $\mathcal{F}$ and
the join of non-adjacent segments of functions $f \in \mathcal{F}$. Its proof is based on similar results
in [10] and [1].

Lemma 1. Suppose that for some $L \ge 1$ there exists a sub-family $\mathcal{F}_0 \subseteq \mathcal{F}$ of $2^L$ functions,
and a pair $k, k' \in [K]$ of non-adjacent integers such that the join

\[ J = \bigvee_{f \in \mathcal{F}_0} \{ s_k(f), s_{k'}(f) \} \]

of non-adjacent $\gamma$-segments has cardinality $2^{2^L}$. Then $\dim_{\gamma/2}(\mathcal{F}) \ge L$.

Remark: The conditions of the lemma ensure that each of the possible intersections
contained in $J$ is non-empty, and therefore $J$ has maximum cardinality.

Proof: Indexing the elements of $\mathcal{F}_0$ in an arbitrary manner by subsets of $[L] := \{1, \ldots, L\}$,
we may write $\mathcal{F}_0 = \{f_\alpha : \alpha \subseteq [L]\}$. For $i = 1, \ldots, L$, let $x_i$ be any element of the intersection

\[ \Big( \bigcap_{\alpha \subseteq [L],\, i \in \alpha} s_k(f_\alpha) \Big) \cap \Big( \bigcap_{\alpha \subseteq [L],\, i \notin \alpha} s_{k'}(f_\alpha) \Big), \]

which is non-empty by assumption. Suppose without loss of generality that $k < k'$, and let
$c = \gamma(k + k' - 1)/2$. Let $\beta$ be any subset of $[L]$ and consider the corresponding function
$f_\beta \in \mathcal{F}_0$. If $i \in \beta$, the selection of $x_i$ ensures that $x_i \in s_k(f_\beta)$, and consequently $f_\beta(x_i) <
\gamma k \le c - \gamma/2$. On the other hand, if $i \in \beta^c$ then $x_i \in s_{k'}(f_\beta)$, and in this case $f_\beta(x_i) \ge
\gamma(k' - 1) \ge c + \gamma/2$. As $\beta$ was arbitrary, it follows that $\dim_{\gamma/2}(\mathcal{F}) \ge L$.
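
The two margin inequalities used in the proof depend only on the indices $k < k'$ being non-adjacent. The following sketch checks them with exact arithmetic; the function names are hypothetical and the code is a sanity check of the arithmetic, not part of the argument.

```python
from fractions import Fraction

def lemma1_level(k, k_prime, gamma):
    """Level c = gamma * (k + k' - 1) / 2 used in the proof, for k < k'."""
    return Fraction(gamma) * (k + k_prime - 1) / 2

def margins_hold(k, k_prime, gamma):
    """Check gamma*k <= c - gamma/2 and gamma*(k' - 1) >= c + gamma/2."""
    gamma = Fraction(gamma)
    c = lemma1_level(k, k_prime, gamma)
    return gamma * k <= c - gamma / 2 and gamma * (k_prime - 1) >= c + gamma / 2

gamma = Fraction(1, 5)
print(all(margins_hold(k, kp, gamma) for k in range(1, 6) for kp in range(k + 2, 6)))
```

For adjacent indices such as $k = 1$, $k' = 2$ the check fails, which is why the lemma requires $|k - k'| \ge 2$.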

3.3 Binary Trees and Ancestral Sets 

Binary trees appear in several key results of the paper. Throughout we consider standard
binary trees $T$ that have a single root, which is assumed to be located at the top of the
tree. Vertices of $T$ are referred to as nodes, and usually denoted by $s$ or $t$. Each node of $T$
has either zero or two distinct children and, with the exception of the root, a single parent.
A node with two children is said to be internal; a node with no children is called a leaf.
The set of leaves in a tree $T$ will be denoted by $\overline{T}$. A descending path in $T$ is a sequence
of adjacent nodes that proceeds only from parent to child. The depth, or level, of a node
$t \in T$ is the length of the shortest (necessarily descending) path from the root to $t$. The
set of nodes at level $r$ of $T$ will be denoted $T[r]$. The depth of $T$ is the maximum depth of
any node in $T$. We will exclusively consider trees of finite depth, say $L$, that are complete
in the sense that $T[r]$ contains $2^r$ nodes for $r = 0, \ldots, L$. In this case, $\overline{T} = T[L]$ and each
node $t \in T[r]$ with $0 \le r \le L - 1$ is internal.

Definition: Let $T$ be a binary tree. A node $s$ in $T$ is an ancestor of a node $t$ if there is a
descending path in $T$ from $s$ to $t$ of length greater than or equal to one. A node $s$ will be
called an ancestor of a set $A \subseteq T$ if $s$ is an ancestor of some $t \in A$.

The next lemma establishes a pigeon-hole type result showing that any large collection
of leaves must have a correspondingly large set of ancestors in some nearby level of the tree.

Lemma 2. Let $T$ be a full binary tree of depth $L$, and let $\overline{T}$ denote the $2^L$ leaves of $T$.
Suppose that there exists a set of leaves $S \subseteq \overline{T}$ and a constant $0 < c < 1$ such that $|S| \ge
c\,2^L \ge 4$. Let $u = \lceil \log_2 c^{-1} + 1 \rceil$. Then there exists a set $S' \subseteq T[l_0]$ with $L - u \le l_0 \le L - 1$
such that for each node $s \in S'$ both of its children are ancestors of $S$, and

\[ |S'| \ge \frac{c\,2^L}{4L}. \tag{6} \]

Proof: For $l = 1, \ldots, L - 1$, let $m_l$ be the number of nodes $s$ at level $l$ that are the ancestor
of some node $t \in S$, and let $n_l$ be the number of nodes at level $l$ with the property that
both their children are ancestors of a node $t \in S$. It is easy to see that $|S| = m_{L-1} + n_{L-1}$,
and more generally we have

\[ |S| = m_{L-v} + n_{L-v} + n_{L-v+1} + \cdots + n_{L-1} \le 2^{L-v} + \sum_{l=L-v}^{L-1} n_l \]

for $v = 1, \ldots, L - 1$. Setting $v = u$, the assumption that $|S| \ge c\,2^L$ yields

\[ \sum_{l=L-u}^{L-1} n_l \ge c\,2^L - 2^{L-u} = 2^{L-u}(c\,2^u - 1) \ge 2^{L-u}, \]

where the last inequality follows from the definition of $u$. Let $n_{l_0}$ be the largest value of $n_l$
appearing in the sum above, and let $S'$ be the nodes at level $l_0$ of $T$ with the property that
both their children are ancestors of $S$. Then

\[ |S'| = n_{l_0} \ge \frac{2^{L-u}}{u} \ge \frac{c\,2^L}{4u} \ge \frac{c\,2^L}{4L} \]

where the second inequality follows from the definition of $u$, and the third from the fact
that $u \le L$, which holds because $c\,2^L \ge 4$.
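
The counting argument can be traced on a small tree. The sketch below is an illustration under stated assumptions: nodes at level $l$ are indexed $0, \ldots, 2^l - 1$ with children $2j$ and $2j + 1$, a leaf in $S$ is treated as lying "above" $S$ (which is how the counts $m_l$ and $n_l$ in the proof are used), $c$ is taken to be exactly $|S|/2^L$, and the function names are hypothetical.

```python
from fractions import Fraction

def split_counts(leaves, depth):
    """n[l]: number of level-l nodes whose two children both lie above (or in) `leaves`."""
    covered = set(leaves)  # nodes at the current child level that lie above or in S
    counts = {}
    for level in range(depth - 1, -1, -1):
        counts[level] = sum(
            1 for j in range(2 ** level)
            if 2 * j in covered and 2 * j + 1 in covered
        )
        covered = {j >> 1 for j in covered}  # move up one level
    return counts

def lemma2_level(leaves, depth):
    """Return (l0, n_l0, bound) with bound = c * 2^L / (4L), assuming 4 <= |S| < 2^L."""
    c = Fraction(len(leaves), 2 ** depth)
    u = 0
    while 2 ** u < 2 / c:  # smallest u with 2^u >= 2/c, i.e. u = ceil(log2(1/c) + 1)
        u += 1
    counts = split_counts(leaves, depth)
    l0 = max(range(depth - u, depth), key=lambda l: counts[l])
    return l0, counts[l0], c * 2 ** depth / (4 * depth)

print(lemma2_level({0, 1, 2, 3, 4}, 4))
```

With five leaves in a tree of depth 4 the selected level is $l_0 = 3$, where two nodes have both children in $S$, comfortably above the bound $5/16$ given by the lemma.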



4 Outline of the Proof of Theorem [2] 

In this section we present an outline of the proof of Theorem [2]. We begin with Proposition
[1], which is the key result of the paper. The proposition shows that if $\mathcal{F}$ is regular and
$\Gamma(\mathcal{F} : \mathbf{X}) > 0$ then one can associate the nodes of an arbitrarily large binary tree with
segments of select functions in $\mathcal{F}$ in such a way that (i) the intersection of segments along
every path from the root to a leaf is non-empty, and (ii) sibling segments are non-adjacent.
The resulting structure will be called an intersection tree.

Proposition [2] refines Proposition [1] using the pigeon-hole principle from Lemma [2]. It
ensures that for every finite $L \ge 1$ there is a family of $L$ functions in $\mathcal{F}$ having non-adjacent
segments with maximal join. The final step in the proof of Theorem [2] is to remove the
regularity condition on $\mathcal{F}$. This is done by means of a measure space isomorphism described
in Lemma [B]. The proof of Theorem [2] appears in Section [7].

4.1 Intersection Trees 

Proposition 1. Let $\mathcal{F}$ and $\mathbf{X}$ be as in Theorem [2]. Suppose that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$ and
that $\mathcal{F}$ is regular. Then for each $L \ge 1$ there exist functions $g_1, \ldots, g_L \in \mathcal{F}$ and a complete
binary tree $T$ of depth $L$ such that each node $t \in T$ is associated with a subset $B_t$ of $[0, 1]$
in such a way that the following two conditions are satisfied.

(a) For each internal node $t \in T$ at level $l$, the sets $B_{t'}$ and $B_{t''}$ associated with its children
$t'$ and $t''$ are equal to non-adjacent segments of $g_{l+1}$.

(b) For each node $t \in T$, the intersection $W_t$ of the sets $B_s$ appearing along a descending
path from the root to $t$ has non-empty interior.

The proof of Proposition [1] is given in Section [5].

4.2 Maximal Joins 

Proposition 2. Let $\mathcal{F}$ and $\mathbf{X}$ be as in Theorem [2]. Suppose that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$ and
that $\mathcal{F}$ is regular. Let $\gamma = \eta/5$. For each $L \ge 1$ there are functions $f_1, \ldots, f_L \in \mathcal{F}$ and a
pair $k, k' \in [K]$ of non-adjacent integers such that the join

\[ J = \{s_k(f_1), s_{k'}(f_1)\} \vee \cdots \vee \{s_k(f_L), s_{k'}(f_L)\} \]

of non-adjacent $\gamma$-segments has (maximum) cardinality $2^L$, and every element of $J$ has
positive Lebesgue measure.

The proof of Proposition [2] appears in Section [6] below.




4.3 Removing Regularity 

Together, Lemma [1] and Proposition [2] establish Theorem [2] in the special case of regular
families. In order to remove the assumption of regularity, we require the following result,
whose proof can be found in [1].

Lemma B. Let $\mathcal{C} = \{C_1, C_2, \ldots\}$ be a countable collection of Borel subsets of $[0, 1]$ such
that the maximum diameter of the elements of the join $J_n = \bigvee_{i=1}^{n} \{C_i, C_i^c\}$ tends to zero
as $n \to \infty$. Then there exists a Borel-measurable map $\phi : [0, 1] \to [0, 1]$ and a Borel set
$V_1 \subseteq [0, 1]$ of measure one such that: (i) $\phi$ preserves Lebesgue measure and is 1:1 on $V_1$;
(ii) the image $V_2 = \phi(V_1)$ and the inverse map $\phi^{-1} : V_2 \to V_1$ are Borel measurable; (iii)
$\phi^{-1}$ preserves Lebesgue measure; and (iv) for every set $C \in \mathcal{C}$ there is a set $U(C)$, equal to
a finite union of intervals, such that $\lambda(\phi(C) \,\triangle\, U(C)) = 0$, where $\triangle$ is the usual symmetric
difference.

Remark: Lemma [B] is applied to the family of sets $\mathcal{C} = \mathcal{C}(\mathcal{F})$. The existence of the
isomorphism $\phi$ requires that $\mathcal{C}$ be countable, and this leads to the requirement that $\mathcal{F}$ be
countable as well.

The proof of Theorem [2] is given in Section [7] below. 

5 Proof of Proposition [1]

Construction of the intersection tree in Proposition [1] is based on a multi-stage procedure
that is detailed below. At the first stage, we produce a refining sequence $J_1, J_2, \ldots$ of joins
in $[0, 1]$ and simultaneously identify a sequence of functions $f_1, f_2, \ldots \in \mathcal{F}$. The join $J_n$ is
generated from selected non-adjacent segments of $f_1, \ldots, f_n$. The function $f_{n+1}$ chosen at
step $(n + 1)$ is an element of $\mathcal{F}$ whose average differs from its expectation by at least $\eta$ on
a sample sufficiently large to ensure that the relative frequency of every element $A \in J_n$
is close to its probability. From $J_n$ and $f_{n+1}$ we identify a set $G_n$ equal to the union of
the cells in $J_n$ on which the average of $f_{n+1}$ is far from its expectation. The sets $G_n$ are
used, in turn, to produce a limiting "splitting" set $R_1$ via a weak convergence argument.
This sequential process is repeated in subsequent stages, with the important feature that
the splitting sets $R_1, \ldots, R_{s-1}$ identified at stages $1, \ldots, s - 1$ are used to generate the joins
and the splitting set at stage $s$.

The proof of Proposition [1] follows the proof of Proposition 3 in [1]. The earlier
proposition treats the special case in which the elements of $\mathcal{F}$ are indicator functions of sets,
and hence binary valued. The definition and construction of the splitting sets $R_s$ follow
the arguments in the binary case, the principal difference being that the generalized joins
defined here involve segments rather than sets. The proof of Lemma [3] below and the three
displays preceding it are identical to arguments in [1]. Differences in the proofs emerge from
the focus here on non-adjacent segments. In particular, the use of intersection trees or a
similar hierarchical structure appears to be required, and the arguments that follow Lemma
[3] are somewhat more involved than in the binary case.

The proof of Proposition [1] requires that one carefully keep track of the quantities
appearing at each step and stage of the construction, and how these quantities are defined.
For this reason, and due to the differences discussed above, it is not possible to substantially
shorten the proof of Proposition [1] by an appeal to the earlier results. We provide a detailed
argument below for completeness.

5.1 Initial Construction 

Let $\mathcal{F}$ be a countable family of Borel measurable functions $f : [0, 1] \to [0, 1]$, and let
$\mathbf{X} = X_1, X_2, \ldots \in [0, 1]$ be a stationary ergodic process defined on an underlying probability
space $(\Omega, \mathcal{A}, \mathbb{P})$ such that $X_i \sim \lambda$. Assume that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$, and that every element
of $\mathcal{C}(\mathcal{F})$ is a finite union of intervals. Let $\delta = \eta/12$, and note that $0 < \delta < 1$. For each $n \ge 1$
let

\[ \mathcal{D}_n = \{ [k\,2^{-n}, (k + 1)\,2^{-n}) : 0 \le k \le 2^n - 2 \} \cup \{ [1 - 2^{-n}, 1] \} \]

be the nth order dyadic subintervals of $[0, 1]$, and let $\mathcal{D} = \bigcup_{n \ge 1} \mathcal{D}_n$. The set $A_0$ consisting
of the endpoints of the intervals from which the elements of $\mathcal{C}(\mathcal{F})$ and $\mathcal{D}$ are constructed
is countable, and therefore has Lebesgue measure zero. Removing a $\mathbb{P}$-null set of outcomes
from $\Omega$, we may assume that $X_i(\omega) \notin A_0$ for each $\omega \in \Omega$ and for every $i \ge 1$. (This
assumption is used in the last part of the proof.)
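
The dyadic families are nested partitions of $[0, 1]$ whose last cell is closed on the right. A minimal sketch with exact endpoints follows; the triple representation `(left, right, right_closed)` and the function names are assumptions of this example.

```python
from fractions import Fraction

def dyadic_intervals(n):
    """Cells of the n-th order dyadic partition as (left, right, right_closed) triples."""
    step = Fraction(1, 2 ** n)
    cells = [(k * step, (k + 1) * step, False) for k in range(2 ** n - 1)]
    cells.append((1 - step, Fraction(1), True))  # the last cell [1 - 2^-n, 1] is closed
    return cells

def dyadic_cell(x, n):
    """Index of the n-th order dyadic cell containing a point x in [0, 1]."""
    return min(int(x * 2 ** n), 2 ** n - 1)

print(dyadic_intervals(2))
```

Each cell of order $n + 1$ lies inside a cell of order $n$, which is what makes the sequence of joins built from these partitions refining.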

Below we identify a sequence of splitting sets $R_1, R_2, \ldots \subseteq [0, 1]$ in stages, and then use
these sets to construct the intersection tree.

Stage 1. The first stage of the construction proceeds as follows. Let $f_1$ be any function
in $\mathcal{F}$, and suppose that functions $f_1, \ldots, f_n \in \mathcal{F}$ have already been selected. Let $J_n =
\mathcal{D}_n \vee \pi(f_1) \vee \cdots \vee \pi(f_n)$ be the join of the dyadic intervals of order $n$ and the $\gamma$-segments
of the previously selected functions. Here and in what follows we take $\gamma = \eta/5$. For each
$\omega \in \Omega$, each function $g : [0, 1] \to [0, 1]$, and each $m \ge 1$, define the (pointwise) discrepancy

\[ \Delta^{\omega}(g : m) = \left| \frac{1}{m} \sum_{j=1}^{m} g(X_j(\omega)) - E g(X) \right| \tag{7} \]

which measures the difference between the expectation of $g(X)$ and its average over the
sample sequence $X_1(\omega), \ldots, X_m(\omega)$. From the ergodic theorem and Proposition [2] it follows
that there exists a sample point $\omega_{n+1} \in \Omega$, an integer $m_{n+1} \ge 1$ and a function $f_{n+1} \in \mathcal{F}$
such that

\[ \Delta^{\omega_{n+1}}(I_A : m_{n+1}) \le \delta\,\lambda(A) \quad \text{for each } A \in J_n \tag{8} \]

and

\[ \Delta^{\omega_{n+1}}(f_{n+1} : m_{n+1}) > \eta. \tag{9} \]

Defining the join $J_{n+1} = \mathcal{D}_{n+1} \vee \pi(f_1) \vee \cdots \vee \pi(f_{n+1})$ and continuing, we may select functions
$f_{n+2}, f_{n+3}, \ldots \in \mathcal{F}$ in a similar fashion.

The relations (8) and (9) together ensure that for many cells $A \in J_n$ the average of $f_{n+1}$
on $A$ differs from its expectation over $A$. To make this precise, define the family

\[ H_n = \Big\{ A \in J_n : \Delta^{\omega_{n+1}}(f_{n+1} \cdot I_A : m_{n+1}) > \frac{\eta}{2}\,\lambda(A) \Big\}. \]

As the next lemma shows, the sets in $H_n \subseteq J_n$ occupy a non-trivial fraction of the unit
interval.

Lemma 3. If $G_n = \cup H_n$ is the union of the sets $A \in H_n$, then $\lambda(G_n) \ge \eta/6$.

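Proof: To simplify notation, let $\omega = \omega_{n+1}$, $f = f_{n+1}$, and $m = m_{n+1}$. Decomposing
$\Delta^{\omega}(f : m)$ over the elements of $J_n$ and applying the triangle inequality, we obtain the
bound

\[ \eta < \sum_{A \in H_n} \Delta^{\omega}(f \cdot I_A : m) + \sum_{A \in J_n \setminus H_n} \Delta^{\omega}(f \cdot I_A : m). \]

By definition of $H_n$, the second term is at most $\eta/2$. The first term is at most

\[ \sum_{A \in H_n} \Delta^{\omega}(f \cdot I_A : m) \le \sum_{A \in H_n} \Big[ \frac{1}{m} \sum_{i=1}^{m} f(X_i(\omega))\, I_A(X_i(\omega)) + E\, f(X)\, I_A(X) \Big] \]
\[ \le \sum_{A \in H_n} \Big[ \frac{1}{m} \sum_{i=1}^{m} I_A(X_i(\omega)) + \lambda(A) \Big] \]
\[ \le \sum_{A \in H_n} \Delta^{\omega}(I_A : m) + 2\lambda(G_n) \]
\[ \le (\delta + 2)\,\lambda(G_n) \le 3\lambda(G_n), \]

where the first two inequalities follow from the fact that $0 \le f \le 1$, the third from the
definition of $\Delta^{\omega}(I_A : m)$ and the disjointness of the cells in $H_n$, and the fourth from (8).
Combining the bounds above yields the stated inequality.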

For each $n \ge 1$ define a sub-probability measure $\lambda_n(B) = \lambda(B \cap G_n)$ on $([0, 1], \mathcal{B})$, where
$G_n = \cup H_n$. The collection $\{\lambda_n\}$ is tight, and is such that $\lambda_n([0, 1]) \ge \eta/6$ for each $n$.
There is therefore a subsequence $n(1) < n(2) < \cdots$ such that $\lambda_{n(r)}$ converges weakly to a
sub-probability measure $\nu_1$ on $([0, 1], \mathcal{B})$. It is easy to see that $\nu_1$ is absolutely continuous
with respect to $\lambda$, that $\nu_1([0, 1]) \ge \eta/6$, and that the Radon-Nikodym derivative $d\nu_1/d\lambda$
is bounded above by 1. Define $R_1 = \{x : (d\nu_1/d\lambda)(x) > \delta\}$. From the previous remarks it
follows that

\[ \frac{\eta}{6} \le \nu_1([0, 1]) = \int_0^1 \frac{d\nu_1}{d\lambda}\,d\lambda = \int_{R_1} \frac{d\nu_1}{d\lambda}\,d\lambda + \int_{R_1^c} \frac{d\nu_1}{d\lambda}\,d\lambda \le \int_{R_1} 1\,d\lambda + \int_{R_1^c} \delta\,d\lambda \le \lambda(R_1) + \delta. \tag{10} \]

As $\delta = \eta/12$, we have $\lambda(R_1) \ge \eta/12 > 0$. This completes the first stage of the construction.

Further Stages. Subsequent stages follow the general iterative procedure used to construct
$R_1$. Let $\omega_{n,s}$, $f_{n,s}$, $J_{n,s}$, $m_{n,s}$, $H_{n,s}$ and $G_{n,s}$ denote the various quantities appearing at the
nth step of stage $s$. In particular, let $f_{n,1} = f_n$ be the nth function produced at stage 1,
and define $J_{n,1}$, $m_{n,1}$, $H_{n,1}$ and $G_{n,1}$ in a similar fashion.

Suppose that for some $s \ge 2$ the construction of the splitting sets $R_1, \ldots, R_{s-1}$ is
complete, and that we wish to construct the set $R_s$ at stage $s$. Let $f_{1,s}$ be any element of
$\mathcal{F}$, and suppose that $f_{1,s}, \ldots, f_{n,s}$ have already been selected. Define the join

\[ J_{n,s} = \mathcal{D}_n \vee \bigvee_{i=1}^{n} \pi(f_{i,s}) \vee \bigvee_{j=1}^{s-1} \{R_j, R_j^c\}. \]

It follows from the ergodic theorem and Proposition [2] that there exists a sample point
$\omega_{n+1,s} \in \Omega$, an integer $m_{n+1,s} \ge 1$, and a function $f_{n+1,s} \in \mathcal{F}$ such that

\[ \Delta^{\omega_{n+1,s}}(I_A : m_{n+1,s}) \le \delta\,\lambda(A) \quad \text{for each } A \in J_{n,s} \tag{11} \]

and

\[ \Delta^{\omega_{n+1,s}}(f_{n+1,s} : m_{n+1,s}) > \eta. \tag{12} \]

We may then define the join $J_{n+1,s}$ using $f_{n+1,s}$ and continue in the same fashion. For each
$n \ge 1$ define the family

\[ H_{n,s} = \Big\{ A \in J_{n,s} : \Delta^{\omega_{n+1,s}}(f_{n+1,s} \cdot I_A : m_{n+1,s}) > \frac{\eta}{2}\,\lambda(A) \Big\} \]

and $G_{n,s} = \cup H_{n,s} \subseteq [0, 1]$. Lemma [3] ensures that $\lambda(G_{n,s}) \ge \eta/6$.

As in stage 1, there is a sequence of integers $n_s(1) < n_s(2) < \cdots$ such that the
sub-probability measures $\lambda_{r,s}(B) = \lambda(B \cap G_{n_s(r),s})$ converge weakly as $r \to \infty$ to a
sub-probability measure $\nu_s$ on $([0, 1], \mathcal{B})$ that is absolutely continuous with respect to $\lambda(\cdot)$. Define
$R_s = \{x : (d\nu_s/d\lambda)(x) > \delta\}$. The argument in (10) shows that $\lambda(R_s) \ge \eta/12$. In what
follows, we need to consider density points of $R_s$. To this end, for each $s \ge 1$ let

\[ \tilde{R}_s = \Big\{ x \in R_s : \lim_{\alpha \to 0} \frac{\lambda((x - \alpha, x + \alpha) \cap R_s)}{2\alpha} = 1 \Big\} \]

be the Lebesgue points of $R_s$. By standard results on differentiation of integrals (cf.
Theorem 31.3 of Billingsley (1995)), we have $\lambda(\tilde{R}_s) = \lambda(R_s) \ge \eta/12$.

5.2 Existence of the Intersection Tree 

Fix an integer $L \ge 1$. As the measures of the sets $\tilde{R}_s$ are bounded away from zero, there exist
positive integers $s_0 < s_1 < \cdots < s_L$ such that $\lambda(\bigcap_{j=0}^{L} \tilde{R}_{s_j}) > 0$. Define the intersections

\[ Q_l = \bigcap_{j=0}^{L-l} \tilde{R}_{s_j} \]

for $l = 0, 1, \ldots, L$, and note that $Q_l \subseteq Q_{l+1}$. In what follows, $B^{\circ}$, $\overline{B}$ and $\partial B$ denote,
respectively, the interior, closure and boundary of a set $B \subseteq [0, 1]$. The following result is
a strengthened version of Proposition [1] that incorporates the sets $Q_l$. Its proof completes
the proof of Proposition [1].

Proposition 3. Suppose that $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$ and that every element of $\mathcal{C}(\mathcal{F})$ is a finite union of intervals. Then there exist functions $g_1, \ldots, g_L \in \mathcal{F}$ and a complete binary tree $T$ of depth $L$ such that each node $t \in T$ is associated with a subset $B_t$ of $[0,1]$ subject to the following conditions:

(a) For each internal node $t \in T[l]$, the sets $B_{t'}$ and $B_{t''}$ associated with its children $t'$ and $t''$ are equal to non-adjacent $\eta/5$-segments of $g_{l+1}$.

(b) For each node $t \in T$, the intersection $W_t$ of the sets $B_s$ appearing along a descending path from the root to $t$ has non-empty interior.

(c) If $t \in T[l]$ then the intersection $W_t^\circ \cap Q_l$ is non-empty.

Proof of Proposition [3]. Let $T$ be a complete binary tree of depth $L$ with root $t_0$, and let $B_{t_0} = [0,1]$. We will assign sets $B_t$ to the nodes of $T$ on a level-by-level basis, beginning with the children of the root. We show below that there exists a function $g_1 \in \mathcal{F}$, and non-adjacent $\gamma$-segments $U, V \in \pi(g_1)$, such that $U^\circ \cap Q_1$ and $V^\circ \cap Q_1$ are non-empty. The children of $t_0$ may then be associated with $U$ and $V$, in either order. To begin, choose a point $x_1 \in Q_0$, which is non-empty by construction, and let $\epsilon = \delta / (2(\delta + 1))$. It follows from the definition of the sets $R_s$ that there exists $\alpha_1 > 0$ such that $I_1 = (x_1 - \alpha_1, x_1 + \alpha_1)$ satisfies

$$\lambda(I_1 \cap Q_0) \geq (1 - \epsilon)\lambda(I_1) = 2\alpha_1(1 - \epsilon). \tag{13}$$

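To simplify notation, let $\kappa = s_L$. The last display and the definition of $R_\kappa$ imply that

$$\nu_\kappa(I_1 \cap R_\kappa) = \int_{I_1 \cap R_\kappa} \frac{d\nu_\kappa}{d\lambda}\, d\lambda \geq \delta\, \lambda(I_1 \cap R_\kappa) \geq 2\alpha_1(1 - \epsilon)\delta.$$

Let $\{n_\kappa(r) : r \geq 1\}$ be the subsequence used to define the sub-probability $\nu_\kappa$. As $I_1$ is an open set, it follows from the Portmanteau theorem that

$$\liminf_{r \to \infty} \lambda(I_1 \cap G_{n_\kappa(r),\kappa}) \geq \nu_\kappa(I_1) \geq \nu_\kappa(I_1 \cap R_\kappa) \geq 2\alpha_1(1 - \epsilon)\delta.$$

Choose $r$ sufficiently large so that $\lambda(I_1 \cap G_{n_\kappa(r),\kappa}) \geq 2\alpha_1(1 - \epsilon)\delta$ and $2^{-n_\kappa(r)} < \delta\alpha_1/4$.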
We require the following subsidiary lemma. Its proof is identical to Lemma 4 in [1], but is included in the Appendix for completeness.

Lemma 4. There exists a set $A \in H_{n_\kappa(r),\kappa}$ such that $A \subseteq I_1$ and $\lambda(A \cap Q_1) > 0$. Moreover, $A$ is contained in $Q_1$.

Let $g_1 = f_{n_\kappa(r)+1,\kappa} \in \mathcal{F}$. By assumption, each element of $\pi(g_1)$ is a finite union of intervals, and no random variable $X_i$ takes values in the finite set $\bigcup_{C \in \pi(g_1)} \partial C$. We argue that the set $A$ identified in Lemma [4] (and therefore $Q_1$) has non-empty intersection with the interiors of two non-adjacent segments of $g_1$. As $A$ has positive measure, and the boundary of each segment of $g_1$ has measure zero, it suffices to exclude the possibility that $A$ intersects no segments, only one segment, or only two adjacent segments of $g_1$.

As $\lambda(A) > 0$ and the segments of $g_1$ form a partition of $[0,1]$, $A$ must intersect the interior of at least one segment of $g_1$. Suppose that $A$ intersects only one segment $U = s_k(g_1)$ of $g_1$. Let $h(x) = g_1(x) - (k-1)\gamma$, and note that $0 \leq h(x) < \gamma$ for each $x \in U$. In this case,

$$\begin{aligned}
E(g_1 I_A)(X) &= \sum_{C \in \pi(g_1)} E(g_1 I_A I_C)(X) = E(g_1 I_A I_U)(X) \\
&= \gamma(k-1)\lambda(A) + E(h I_A)(X).
\end{aligned} \tag{14}$$

Similarly, for each $m \geq 1$,

$$\begin{aligned}
\frac{1}{m}\sum_{i=1}^{m} (g_1 I_A)(X_i) &= \frac{1}{m}\sum_{i=1}^{m} \sum_{C \in \pi(g_1)} (g_1 I_A I_C)(X_i) = \frac{1}{m}\sum_{i=1}^{m} (g_1 I_A I_U)(X_i) \\
&= \gamma(k-1)\, \frac{1}{m}\sum_{i=1}^{m} I_A(X_i) + \frac{1}{m}\sum_{i=1}^{m} (h I_A)(X_i).
\end{aligned} \tag{15}$$



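Letting $m = m_{n_\kappa(r)+1,\kappa}$, we find that

$$\begin{aligned}
\tfrac{\eta}{2}\lambda(A) &< \Delta_\omega(g_1 \cdot I_A : m) \\
&\leq \gamma(k-1)\Delta_\omega(I_A : m) + \max\left\{ \frac{1}{m}\sum_{i=1}^{m} (h I_A)(X_i),\; E(h I_A)(X) \right\} \\
&\leq \gamma(k-1)\Delta_\omega(I_A : m) + \gamma \max\left\{ \frac{1}{m}\sum_{i=1}^{m} I_A(X_i),\; \lambda(A) \right\} \\
&\leq \gamma(k-1)\Delta_\omega(I_A : m) + \gamma\left(\lambda(A) + \Delta_\omega(I_A : m)\right) \\
&\leq \Delta_\omega(I_A : m) + \gamma\lambda(A) \\
&\leq (\delta + \gamma)\lambda(A).
\end{aligned}$$

Here the first inequality follows from the definition of $H_{n_\kappa(r),\kappa}$, the second follows from (14) and (15), the third follows from the bound on $h(\cdot)$, and the last follows from the definition of $m$. Comparing the first and last terms above, our definition of $\delta = \eta/12$ and $\gamma = \eta/5$ yields a contradiction.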

Suppose finally that $A$ intersects only two adjacent segments of $g_1$, say $U = s_k(g_1)$ and $V = s_{k+1}(g_1)$. Let $h(x)$ be defined as above, and note that $0 \leq h(x) < 2\gamma$ for $x \in U \cup V$. Arguing as above, we find that

$$E(g_1 \cdot I_A)(X) = \gamma(k-1)\lambda(A) + E(h I_A)(X),$$

and that for each $m \geq 1$,

$$\frac{1}{m}\sum_{i=1}^{m} (g_1 I_A)(X_i) = \gamma(k-1)\, \frac{1}{m}\sum_{i=1}^{m} I_A(X_i) + \frac{1}{m}\sum_{i=1}^{m} (h I_A)(X_i).$$

Letting $m = m_{n_\kappa(r)+1,\kappa}$, the previous two displays, and arguments like those above, can be used to show that

$$\begin{aligned}
\tfrac{\eta}{2}\lambda(A) &< \Delta_\omega(g_1 \cdot I_A : m) \\
&\leq \gamma(k-1)\Delta_\omega(I_A : m) + 2\gamma\left(\lambda(A) + \Delta_\omega(I_A : m)\right) \\
&\leq (1 + \gamma)\Delta_\omega(I_A : m) + 2\gamma\lambda(A) \\
&\leq \left((1 + \gamma)\delta + 2\gamma\right)\lambda(A).
\end{aligned}$$

Comparing the first and last terms, the definition of $\delta = \eta/12$ and $\gamma = \eta/5$ yields a contradiction, and we conclude that $A$ intersects the interiors of two non-adjacent segments $U$ and $V$ of $g_1$. This completes the assignment of sets to the children of the root $t_0$.
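
The two contradictions above come down to elementary arithmetic in $\eta$. The sketch below checks that arithmetic with exact rationals, under two assumptions that should be read as such: the left-hand side of each chain is $(\eta/2)\lambda(A)$, and $0 < \eta < 1$. The function names are hypothetical.

```python
from fractions import Fraction


def one_segment_bound(eta):
    """Coefficient of lambda(A) in the final bound of the one-segment case: delta + gamma."""
    delta, gamma = eta / 12, eta / 5
    return delta + gamma


def two_segment_bound(eta):
    """Coefficient of lambda(A) in the two-segment case: (1 + gamma) * delta + 2 * gamma."""
    delta, gamma = eta / 12, eta / 5
    return (1 + gamma) * delta + 2 * gamma


# Check both upper bounds against the assumed lower bound eta / 2 on a grid in (0, 1).
etas = [Fraction(k, 100) for k in range(1, 100)]
one_ok = all(one_segment_bound(e) < e / 2 for e in etas)
two_ok = all(two_segment_bound(e) < e / 2 for e in etas)
```

In the one-segment case the upper coefficient is $17\eta/60$, well below $\eta/2$. In the two-segment case it is $(29\eta + \eta^2)/60$, which is below $\eta/2$ exactly when $\eta < 1$, so the margin there is narrow.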

Suppose now that for some $l \leq L - 1$ we have assigned sets $B_t \subseteq [0,1]$ to each node $t$ of $T$ having depth less than or equal to $l$, in such a way that properties (a) - (c) of the Proposition hold. There are $2^l$ nodes of $T$ at distance $l$ from the root. Denote these nodes by $1 \leq j \leq 2^l$, and let $W_j$ be the intersection of the sets $B_s$ appearing on the descending path from the root of $T$ to node $j$ at level $l$. By assumption, $W_j^\circ \cap Q_l$ is non-empty: let $x_j \in W_j^\circ \cap Q_l$ for each $j \in [2^l]$. Select $\alpha_{l+1} > 0$ such that, for each $j$, the interval $I_j = (x_j - \alpha_{l+1}, x_j + \alpha_{l+1})$ is contained in $W_j^\circ$ and satisfies

$$\lambda(I_j \cap Q_l) \geq (1 - \epsilon)\lambda(I_j) = 2\alpha_{l+1}(1 - \epsilon).$$

Let $\kappa' = s_{L-l}$ and let $\{n_{\kappa'}(r) : r \geq 1\}$ be the subsequence used to define the sub-probability $\nu_{\kappa'}$. For each interval $I_j$,

$$\liminf_{r \to \infty} \lambda(I_j \cap G_{n_{\kappa'}(r),\kappa'}) \geq \nu_{\kappa'}(I_j) \geq \nu_{\kappa'}(I_j \cap R_{\kappa'}) \geq 2\alpha_{l+1}(1 - \epsilon)\delta,$$

where the last inequality follows from the previous display, and the fact that $Q_l \subseteq R_{\kappa'}$. Choose $r$ sufficiently large so that $\lambda(I_j \cap G_{n_{\kappa'}(r),\kappa'}) \geq 2\alpha_{l+1}(1 - \epsilon)\delta$ for each $j = 1, \ldots, 2^l$, and $2^{-n_{\kappa'}(r)} < \delta\alpha_{l+1}/4$.

Applying the proof of Lemma [4] to each interval $I_j$, we may identify sets $A_1, A_2, \ldots, A_{2^l} \in H_{n_{\kappa'}(r),\kappa'}$ such that $\lambda(A_j) > 0$, $A_j \subseteq I_j \subseteq W_j^\circ$, and $A_j \subseteq Q_{l+1}$ for each $j = 1, \ldots, 2^l$. Define $g_{l+1} = f_{n_{\kappa'}(r)+1,\kappa'} \in \mathcal{F}$. Arguments identical to those in the case $l = 0$ above show that, for each $j$, there exist non-adjacent segments $U_j, V_j$ of $g_{l+1}$ such that $A_j \cap U_j^\circ$ and $A_j \cap V_j^\circ$ are non-empty. Assigning the sets $U_j$ and $V_j$ to the left and right children of $j$ in $T$, in either order, ensures that property (a) of the proposition is satisfied. For the child $t$ of node $j$ associated with the set $U_j$ we have $W_t = W_j \cap U_j$. It follows from the fact that $A_j \subseteq W_j^\circ$, $A_j \cap U_j^\circ \neq \emptyset$ and $A_j \subseteq Q_{l+1}$ that $W_t^\circ \cap Q_{l+1} \neq \emptyset$, and therefore properties (b) and (c) of the proposition are satisfied. The argument for the other child of node $j$ is similar. This completes the proof of Proposition [3].

6 Proof of Proposition [2] 

Proof of Proposition [2]. Fix $L \geq 1$ such that $2^{L-1}/K^2 \geq 4$, and let $T$ be the complete binary tree of depth $L$ described in Proposition [1]. Suppose that each interior node $t \in T$ is labeled with the indices of the segments assigned to its children: if the segments $s_k(g_r)$ and $s_{k'}(g_r)$ of $g_r$ are assigned to the children of a node $t \in T[r-1]$, then $t$ is assigned the label $\ell(t) = (k, k') \in [K]^2$, where $[K] = \{1, \ldots, K\}$.

Let $L_0 = L - 1$. By an elementary pigeon-hole argument, there exist non-adjacent integers $k_0, k_0' \in [K]$ such that the set $S_0$ of nodes $t \in T[L_0]$ with $\ell(t) = (k_0, k_0')$ has cardinality at least $2^{L_0}/K^2$. (Here $K^2$ is an upper bound on the number of non-adjacent pairs $k, k' \in [K]$.) Let $u_0 = \lceil \log_2 K^2 + 1 \rceil$.
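
The first pigeon-hole step can be illustrated with a small simulation. The sketch below assigns arbitrary non-adjacent labels to the $2^{L_0}$ nodes of one level and confirms that the most frequent label occurs at least $2^{L_0}/K^2$ times. The values of `K` and `L0`, the random labelling, and the reading of "non-adjacent" as $|k - k'| \geq 2$ are assumptions made for illustration only.

```python
from collections import Counter
import random


def nonadjacent_pairs(K):
    """All ordered pairs (k, k') in [K]^2 with |k - k'| >= 2 (assumed meaning of non-adjacent)."""
    return [(k, kp) for k in range(1, K + 1) for kp in range(1, K + 1) if abs(k - kp) >= 2]


def most_common_label(labels):
    """Return (label, count) for the most frequent label among the nodes of one level."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count


K, L0 = 5, 10  # hypothetical values
rng = random.Random(0)
pairs = nonadjacent_pairs(K)

# One label per node at level L0; the labelling itself is arbitrary.
labels = [rng.choice(pairs) for _ in range(2 ** L0)]
label, count = most_common_label(labels)
```

Since there are at most $K^2$ possible labels, the guarantee `count * K**2 >= 2**L0` holds for every labelling, not only for the random one used here.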

It follows from Lemma [2] and an additional pigeon-hole argument that there exists an integer $L_1$, a pair $k_1, k_1' \in [K]$ of non-adjacent integers, and a set of nodes $S_1 \subseteq T[L_1]$ with the following properties: (i) $L_0 - u_0 \leq L_1 \leq L_0 - 1$; (ii) $\ell(t) = (k_1, k_1')$ for every $t \in S_1$;
(iii) for every t £ Si, each child of t is an ancestor of So; an d (iv) |Si| > 2 L ° / 'ALK^ . In 
particular, inequalities (i) and (iv) imply that 

f2 L °~ Ll \ ( 1 \ 2 L ° 

1*1 a 2 ' (ll^-j a 2 ' (JSP) a SEP- (16 » 

If the last term above is greater than or equal to 4, then we may apply Lemma [2] again to find an integer $L_2$ and a set of nodes $S_2 \subseteq T[L_2]$ with properties analogous to (i) - (iv) above. Continuing in this fashion, we obtain integers $L_0 > L_1 > \cdots > L_R \geq 0$, sets of nodes $S_r \subseteq T[L_r]$, and non-adjacent pairs $k_r, k_r' \in [K]$ such that for $1 \leq r \leq R$ and for every node $t \in S_r$, $\ell(t) = (k_r, k_r')$ and both children of $t$ are ancestors of $S_{r-1}$. In particular, using arguments like those in (16), one may show that
/ 1 \ 2 L ~ l 

\S r \ > 2 Lr — — ^ -a > 



(2LK 2 ) r K 2 J ~ 4 r ■ K 2r + l ■ (2LK 2 y< r+i y 2 ' 

and therefore $R = R(L)$ can be taken to be the largest integer $r \geq 1$ for which the last term above is greater than 4. In particular, $R(L)$ tends to infinity with $L$.

From the construction above, and an additional pigeon-hole argument, we may identify an integer $N = N(L) \geq R(L)/K^2$ and a subsequence $i_0 < i_1 < \cdots < i_N$ of $L_R, L_{R-1}, \ldots, L_0$ such that $(k_{i_j}, k'_{i_j}) = (k, k')$ for a fixed non-adjacent pair $(k, k') \in [K]^2$. From the associated node-sets $S_{i_0}, \ldots, S_{i_N}$ one may construct an embedded binary subtree $\tilde{T}$ of $T$ all of whose node labels are equal to $(k, k')$. To see this, let the root of $\tilde{T}$ be any node $s \in S_{i_0}$. At each level $0 \leq r \leq N - 1$ let the left and right children of $t \in \tilde{T}[r]$ be (necessarily distinct) descendants in $S_{i_{r+1}}$ of the children of $t \in T$. Then it is easy to see that $\tilde{T}$ is a complete binary tree of depth $N$.

For $r = 0, \ldots, N - 1$ let $h_r = g_{i_r + 1}$. By construction, each node $t \in \tilde{T}[r]$ is contained in $S_{i_r}$ and has label $\ell(t) = (k, k')$. Thus the children $t'$ and $t''$ of $t$ in $\tilde{T}$ are associated with the segments $s_k(h_r)$ and $s_{k'}(h_r)$ of $h_r$. For each terminal node $t \in \tilde{T}$ let $W_t$ be the intersection of the sets $B_s$ appearing on the descending path (in $T$) from the root of $\tilde{T}$ to $t$. The construction of $\tilde{T}$ ensures that every member of $\{W_t : t \in \tilde{T}\}$ is contained in a unique element of the join

$$J = \{s_k(h_0), s_{k'}(h_0)\} \vee \cdots \vee \{s_k(h_{N-1}), s_{k'}(h_{N-1})\}.$$

Moreover, by Proposition [1] each set $W_t$ has non-empty interior, and positive Lebesgue measure, and the same is therefore true for each element of $J$. As $N(L)$ tends to infinity with $L$, the lemma follows.
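
To make the notion of a full join concrete, the following sketch computes the join of several two-set families on a finite point set and counts its non-empty cells. Finite sets stand in for subsets of $[0,1]$, and non-emptiness stands in for positive Lebesgue measure; both substitutions, along with the function name and the example families, are assumptions made for illustration.

```python
from itertools import product


def join_cells(families):
    """Non-empty intersections obtained by picking one set from each family."""
    cells = []
    for choice in product(*families):
        cell = set.intersection(*choice)
        if cell:
            cells.append(frozenset(cell))
    return cells


# Hypothetical example on the grid {0, ..., 7}: family r splits the points on bit r.
points = range(8)
families = [
    (
        {p for p in points if (p >> r) & 1 == 0},
        {p for p in points if (p >> r) & 1 == 1},
    )
    for r in range(3)
]

# A full join of 3 two-set families has 2**3 non-empty cells.
cells = join_cells(families)
```

In this example every one of the $2^3$ possible intersections is non-empty, which is the finite analogue of a join of $N$ two-set families having $2^N$ cells of positive measure.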

7 Proof of Theorem [2] 

Proof of Theorem [2]. Let $\mathcal{F}$ and $\mathbf{X}$ be as in the statement of the proposition. Then $\Gamma(\mathcal{F} : \mathbf{X}) > \eta > 0$. Let $\mathcal{C}(\mathcal{F})$ be the countable family defined in (5). Without loss of generality, we may assume that $\mathcal{F}$ contains the identity function $f_0(x) = x$, and therefore $\mathcal{C}(\mathcal{F})$ satisfies the shrinking diameter condition of Lemma [B]. Let the sets $V_1, V_2 \subseteq [0,1]$ and map $\phi(\cdot)$ be as in the statement of Lemma [B].

Define random variables $Y_i = \phi(X_i)$ for $i \geq 1$. Then the process $\mathbf{Y} = Y_1, Y_2, \ldots$ is stationary and ergodic with $Y_i \sim \lambda$. For each $f \in \mathcal{F}$ define an associated function $g_f : [0,1] \to [0,1]$ via the rule

$$g_f(u) = \begin{cases} (f \circ \phi^{-1})(u) & \text{if } u \in V_2 \\ 0 & \text{if } u \in V_2^c \end{cases}$$

and let $\mathcal{G} = \{g_f : f \in \mathcal{F}\}$. Arguments like those in Section [2] above show that $\Gamma_m(\mathcal{G} : \mathbf{Y})$ is equal to $\Gamma_m(\mathcal{F} : \mathbf{X})$ with probability one for each $m \geq 1$, and consequently $\Gamma(\mathcal{G} : \mathbf{Y}) > \eta$.

Let the constants $\gamma$ (equal to $\eta/5$) and $K$, and the segments $s_k(f)$, be defined as in (1), and let $\epsilon = \Gamma(\mathcal{G} : \mathbf{Y}) - \eta > 0$. Choose a finite sequence of rational numbers $0 = a_0 < a_1 < \cdots < a_N = 1$ that includes $\{\gamma k : k = 1, \ldots, K - 1\}$ and is such that $\max_j |a_j - a_{j-1}| < \epsilon/2$. Define intervals $U_j = [a_{j-1}, a_j)$ for $j = 1, \ldots, N - 1$, and let $U_N = [a_{N-1}, 1]$. Using (1) one may verify that for each $g_f \in \mathcal{G}$,

$$g_f^{-1} U_j = \begin{cases} \phi(f^{-1} U_j) & \text{if } 2 \leq j \leq N \\ \phi(f^{-1} U_j) \cup V_2^c & \text{if } j = 1 \end{cases}$$

where the second condition results from the fact that the interval $U_1$ contains zero.

Let $\mathcal{U}$ be the family of subsets of $[0,1]$ that are equal to finite unions of intervals, and let $A \doteq B$ denote the fact that $A$ and $B$ are equivalent mod 0, in other words, $\lambda(A \triangle B) = 0$. Fix a function $f \in \mathcal{F}$, and let $g_f$ be the associated element of $\mathcal{G}$. Lemma [B] and the fact that $\lambda(V_2^c) = 0$ imply that there exist sets $C_1, \ldots, C_N \in \mathcal{U}$ such that $g_f^{-1} U_j \doteq C_j$ for $1 \leq j \leq N$. If $i \neq j$ then

$$\lambda(C_i \cap C_j) = \lambda(g_f^{-1} U_i \cap g_f^{-1} U_j) = \lambda(g_f^{-1}(U_i \cap U_j)) = 0$$

so that $C_i$ and $C_j$ can intersect only at the endpoints of their constitutive intervals. It follows that the function $h_f(u) = \sum_{j=1}^{N} a_{j-1} I_{C_j}(u)$ approximates $g_f$ in the sense that $|g_f(u) - h_f(u)| < \epsilon/2$ with probability one. Moreover, $h_f^{-1}[a, b) \in \mathcal{U}$ for all rational $a, b$.
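
The approximation step can be sketched pointwise: a value $g$ in $[a_{j-1}, a_j)$ is replaced by the left endpoint $a_{j-1}$, so the error is non-negative and smaller than the mesh of the grid. The code below works on values rather than on the sets $C_j$, and the grid, the tolerance and the function name are hypothetical choices made for illustration.

```python
from bisect import bisect_right
from fractions import Fraction


def simple_approximation(value, grid):
    """Return a_{j-1} for the cell [a_{j-1}, a_j) containing value; the last cell is closed at 1."""
    j = min(bisect_right(grid, value), len(grid) - 1)
    return grid[j - 1]


eps = Fraction(1, 4)  # hypothetical tolerance
N = 10
grid = [Fraction(j, N) for j in range(N + 1)]  # mesh 1/10 < eps / 2 = 1/8

# Approximation errors over a fine set of values in [0, 1].
samples = [Fraction(i, 1000) for i in range(1001)]
errors = [v - simple_approximation(v, grid) for v in samples]
```

Because the grid is chosen with mesh below $\epsilon/2$, every error lies in $[0, \epsilon/2)$, which mirrors the bound $|g_f(u) - h_f(u)| < \epsilon/2$.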

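Let $\mathcal{H} = \{h_f : f \in \mathcal{F}\}$ be the family of simple approximations to the elements of $\mathcal{G}$. Then $\mathcal{C}(\mathcal{H})$ is contained in $\mathcal{U}$, and a straightforward argument shows that $\Gamma(\mathcal{H} : \mathbf{Y}) > \eta$. Fix $L \geq 1$. As $\mathcal{H}$ satisfies the conditions of Proposition [2], there exist functions $f_1, \ldots, f_L \in \mathcal{F}$ and a pair of non-adjacent integers $k, k' \in [K]$ such that the join

$$J_h = \bigvee_{\ell=1}^{L} \{s_k(h_{f_\ell}), s_{k'}(h_{f_\ell})\}$$

has $2^L$ elements, each with positive measure. In order to obtain a full join for the segments of $f_1, \ldots, f_L$, we examine how the segments of $h_f$ are related to those of $f$. To this end, let $i < j$ be such that $a_i = (k-1)\gamma$ and $a_j = k\gamma$. Then for every $f \in \mathcal{F}$,

$$\begin{aligned}
s_k(h_f) &= h_f^{-1}[(k-1)\gamma, k\gamma) = h_f^{-1}[a_i, a_j) \\
&\doteq \bigcup_{l=i+1}^{j} C_l \doteq \bigcup_{l=i+1}^{j} g_f^{-1} U_l \\
&= g_f^{-1}[(k-1)\gamma, k\gamma) \\
&\doteq \phi(f^{-1}[(k-1)\gamma, k\gamma)) = \phi(s_k(f)).
\end{aligned}$$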

The same argument applies to $s_{k'}(h_f)$, and therefore every element of $J_h$ is equivalent mod zero to an element of the join

$$J'_h = \bigvee_{\ell=1}^{L} \{\phi(s_k(f_\ell)), \phi(s_{k'}(f_\ell))\}.$$

As $\phi$ is a bijection almost everywhere, every element of $J'_h$ is equivalent mod zero to a set of the form $\phi(A)$, where $A$ is an element of the join

$$J_f = \bigvee_{\ell=1}^{L} \{s_k(f_\ell), s_{k'}(f_\ell)\}.$$

As each cell of $J'_h$ has positive Lebesgue measure, the same is true of the cells of $J_f$. In particular, $J_f$ has (maximum) cardinality $2^L$. As $L \geq 1$ was arbitrary, Theorem [2] follows from Lemma [1].

A Appendix 

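A.1 Proof of Lemma [4]

The proof of Lemma [4] appears in [1]; we reproduce it here for completeness.
Proof: Let $G = G_{n_\kappa(r),\kappa}$. The choice of $n_\kappa(r)$ ensures that

$$\begin{aligned}
(1 - \epsilon)\delta\lambda(I_1) &\leq \lambda(I_1 \cap G) \\
&= \lambda(I_1 \cap Q_1 \cap G) + \lambda(I_1 \cap Q_1^c \cap G) \\
&\leq \lambda(I_1 \cap Q_1 \cap G) + \lambda(I_1 \cap Q_1^c) \\
&\leq \lambda(I_1 \cap Q_1 \cap G) + \epsilon\lambda(I_1)
\end{aligned}$$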

where the final inequality follows from (13) and the fact that $Q_0 \subseteq Q_1$. It follows from the display and the definition of $\epsilon$ that $\lambda(I_1 \cap Q_1 \cap G) \geq \delta\alpha_1$. As the collection of sets used to define the join $J_{n_\kappa(r),\kappa}$ includes the dyadic intervals of order $n_\kappa(r)$, each element $A$ of the join has diameter (and Lebesgue measure) bounded by $2^{-n_\kappa(r)} < \delta\alpha_1/4$. These last two inequalities imply that

$$\delta\alpha_1 \leq \lambda(I_1 \cap Q_1 \cap G) \leq \sum_{A} \lambda(Q_1 \cap A) + 2 \cdot \frac{\delta\alpha_1}{4},$$

where the sum is over $A \in H_{n_\kappa(r),\kappa}$ such that $A \subseteq I_1$. In particular, it is clear that the sum is necessarily positive, and the first part of the claim follows. Moreover, for any set $A \in H_{n_\kappa(r),\kappa}$ the definition of the join $J_{n_\kappa(r),\kappa}$ requires that $A$ be contained in either $R_{s_j}$ or $R_{s_j}^c$, for each $j = 0, \ldots, L - 1$. If $\lambda(A \cap Q_1) > 0$ then necessarily $A \cap Q_1 \neq \emptyset$, and these containment relations imply that $A \subseteq Q_1$. This completes the proof of Lemma [4].
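
The role of the constant $\epsilon$ in the proof above can be checked directly. Assuming $\epsilon = \delta/(2(\delta+1))$, the first display gives $\lambda(I_1 \cap Q_1 \cap G) \geq ((1-\epsilon)\delta - \epsilon)\lambda(I_1)$, and the sketch below verifies with exact rationals that this coefficient equals $\delta/2$, so that the bound is $\delta\alpha_1$ when $\lambda(I_1) = 2\alpha_1$. The function names and the range of test values are hypothetical.

```python
from fractions import Fraction


def epsilon(delta):
    """The constant epsilon = delta / (2 * (delta + 1)), as assumed in the lead-in."""
    return delta / (2 * (delta + 1))


def lower_bound_coefficient(delta):
    """Coefficient c in lambda(I_1 & Q_1 & G) >= c * lambda(I_1): (1 - eps) * delta - eps."""
    eps = epsilon(delta)
    return (1 - eps) * delta - eps


# delta = eta / 12 for a few values of eta in (0, 1).
deltas = [Fraction(k, 120) for k in range(1, 11)]
coeffs = [lower_bound_coefficient(d) for d in deltas]

# Mass left for the sum over A after removing two boundary cells of size < delta * alpha_1 / 4,
# expressed per unit of alpha_1.
remaining = [2 * c - 2 * d / 4 for c, d in zip(coeffs, deltas)]
```

The remaining mass is $\delta\alpha_1/2$ per interval, which is why the sum over $A \subseteq I_1$ must be positive.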